46 - Lecture_10_3_Conjugate_Directions_Descent [ID:39330]

Hi. The next step is to understand what conjugacy means. Let's first recall what orthogonality

means: u is orthogonal to v means that the scalar product u-transposed times v is

equal to zero. Then u and v are called orthogonal. And then the definition, I don't know the

number actually, so let's call it 4.1, which might be wrong: u and

v are called b-orthogonal, where b has to be an SPD matrix, so symmetric and positive definite,

if and only if u-transposed b v is equal to zero. Because b is symmetric, this is also equivalent

to saying that v-transposed b times u is equal to zero. In terms of notation, we write

u is b-orthogonal to v, and we call u and v b-orthogonal or conjugate. Of course, if

we say conjugate, then we have to be sure that we know which matrix b we mean.

So in a specific setting, there will usually be a unique matrix b, and then we will just

call it conjugate. If we want to stress exactly which matrix we mean, we call it b-orthogonal.
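
The definition can be checked numerically. The following is only a minimal sketch and not part of the lecture: the example matrix `B`, the vectors `u` and `v`, and the helper names `bilinear` and `is_b_orthogonal` are assumptions chosen for illustration. It evaluates u-transposed b v directly and shows that the order of u and v does not matter for a symmetric matrix.

```python
def bilinear(u, B, v):
    """Return u^T B v for vectors and a square matrix given as nested lists."""
    Bv = [sum(B[i][j] * v[j] for j in range(len(v))) for i in range(len(B))]
    return sum(u[i] * Bv[i] for i in range(len(u)))


def is_b_orthogonal(u, v, B, tol=1e-12):
    """True if u^T B v vanishes up to the tolerance tol."""
    return abs(bilinear(u, B, v)) <= tol


# Assumed example matrix: symmetric and positive definite.
B = [[2.0, 1.0],
     [1.0, 3.0]]

u = [1.0, 0.0]
v = [1.0, -2.0]  # B v = (0, -5), hence u^T B v = 0

# u and v are conjugate with respect to B.
print(is_b_orthogonal(u, v, B))
# Because B is symmetric, swapping u and v gives the same value.
print(bilinear(u, B, v) == bilinear(v, B, u))
# With the identity matrix we get the usual scalar product, which is not zero here.
print(is_b_orthogonal(u, v, [[1.0, 0.0], [0.0, 1.0]]))
```

So the same pair of vectors can be conjugate for one matrix and not for another, which is why we have to know which matrix b is meant.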

Okay, now what does that mean? What's the geometric idea behind it? Of course,

the usual notion of orthogonality is, if you want, identity-orthogonality: the case where b is the

identity matrix. But what if b is, for example, a matrix such that the level sets of the quadratic

form u-transposed b u look like those ellipsoids? If b is an SPD matrix, then you can always draw those

ellipsoids, because the level sets of u-transposed b u will look like ellipsoids. And it's actually

quite easy to explain what b-orthogonal vectors look like. For that we switch again to

this website. Let's maybe zoom in a bit here. The idea is to look at

those level sets and stretch this image such that those level sets become circles. You

can see how we're kind of pulling here and here for as long as we need, until those ellipsoids

become circles. As you can see, everything is stretched out. If we were to print

this on a bubblegum sheet, so to speak, you could do this in principle. And what do b-orthogonal

vectors look like? Well, it's like this: two vectors are b-orthogonal if and only

if in the stretched version of reality they are actually orthogonal. So for all those pairs

of vectors, the position actually doesn't matter. So we could have taken this blue pair

of vectors, so this vector pointing to the right and this vector pointing to the lower right.

We could move this pair around in space, so it doesn't depend on the spatial position where we attach

them. But if we take them, draw them on this bubblegum sheet, apply this stretch,

and they then look orthogonal in this plot, then they are b-orthogonal. So as you can

see, sometimes conjugacy or b-orthogonality is the same as orthogonality in the usual

sense. If you take those vectors here and stretch them, we get those vectors, and

they look orthogonal before and after the stretch. But sometimes it doesn't look like they're

orthogonal. For example, these two vectors have an acute angle between them. But

stretching these two vectors gives us a right angle here, so they're orthogonal in this

picture, which means that these two vectors are b-orthogonal. It's similar here: they are

at an obtuse angle, so the angle between those two vectors is larger than 90 degrees. But

after stretching this picture, those two vectors are again at a right angle to each other. And

if you now look at this blue pair of vectors, we can change their orientation, and

you can see that the angle changes between an obtuse angle like this, a right angle, and an acute

angle like this. So b-orthogonality depends on the direction of those two vectors. It's

not a fixed angle between those two vectors; what the angle has to be depends on the direction of one of

those vectors. In my opinion, the nicest explanation is this: two

vectors are b-orthogonal if, after stretching the domain such that the level sets of this quadratic

form are circles, those two vectors appear to be orthogonal in this stretched domain.
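
The stretching picture can also be tried out numerically. The following is a rough sketch under stated assumptions, not the code behind the website: it uses a Cholesky factor L with b = L L-transposed as one possible stretch, mapping x to y = L-transposed x, so that x-transposed b x equals y-transposed y and the level sets become circles. Any other map A with A-transposed A = b would do the same job up to a rotation. The matrix `B` and all function names are hypothetical and restricted to the 2-by-2 case.

```python
import math


def cholesky_2x2(B):
    """Lower-triangular L with B = L L^T for a 2x2 SPD matrix."""
    l11 = math.sqrt(B[0][0])
    l21 = B[1][0] / l11
    l22 = math.sqrt(B[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def stretch(x, L):
    """Map x to y = L^T x, so that x^T B x = y^T y."""
    return [L[0][0] * x[0] + L[1][0] * x[1],
            L[0][1] * x[0] + L[1][1] * x[1]]


def angle_deg(a, b):
    """Euclidean angle between a and b in degrees."""
    dot = a[0] * b[0] + a[1] * b[1]
    cos = dot / (math.hypot(*a) * math.hypot(*b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def conjugate_direction(u, B):
    """A vector v with u^T B v = 0, obtained by rotating B u by 90 degrees."""
    Bu = [B[0][0] * u[0] + B[0][1] * u[1],
          B[1][0] * u[0] + B[1][1] * u[1]]
    return [-Bu[1], Bu[0]]


# Assumed example: the level sets of x^T B x are axis-aligned ellipses.
B = [[1.0, 0.0],
     [0.0, 4.0]]
L = cholesky_2x2(B)

# Rotate u and compare the angle to its conjugate partner before and after stretching.
for theta in (0, 30, 60, 90, 120, 150):
    u = [math.cos(math.radians(theta)), math.sin(math.radians(theta))]
    v = conjugate_direction(u, B)
    before = angle_deg(u, v)
    after = angle_deg(stretch(u, L), stretch(v, L))
    print(f"theta={theta:3d}  angle before stretch={before:6.1f}  after={after:6.1f}")
```

In the original picture the angle between the two conjugate vectors changes with the direction of u, while in the stretched coordinates it stays at a right angle, up to rounding.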

So that's conjugacy. And this is kind of the right geometrical idea here.

Well, let me explain this differently. Let's switch back to the lecture. So another throwback

to gradient descent: if b is the n-by-n identity matrix, then the

level sets are circles, so not ellipses, not ellipsoids; they are circles or spheres, hyperspheres,

whatever. So it looks like this. This is the domain, the minimum is somewhere, and then the

level sets are spheres around this minimum. And if we now take any point, it doesn't

Access: Open access
Duration: 00:52:13
Recording date: 2021-12-14
Uploaded: 2021-12-14 12:46:04
Language: en-US
